Pharmacoepidemiology and Drug Safety
Wiley
Preprints posted in the last 30 days, ranked by how well they match Pharmacoepidemiology and Drug Safety's content profile, based on 13 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.
Carlisle, B. G.; Hutchinson, N.; Moyer, H.
Background: The global SARS-CoV-2 pandemic disrupted healthcare systems worldwide, raising concerns about its impact on clinical research. Early reports suggested reductions in participant enrollment, interruptions to ongoing trials, and challenges to protocol adherence, yet the magnitude and duration of these operational disruptions remain unclear. Methods: We conducted a registry-based analysis comparing clinical trials during the COVID-19 pandemic (December 2019 to November 2022) with a matched pre-pandemic cohort (December 2016 to November 2019). Studies were included if they reported any modifications to trial status, enrollment, or protocols during the study periods. Key variables included trial stoppage, enrollment changes, and adoption of remote or hybrid procedures. Results: The global SARS-CoV-2 pandemic resulted in widespread disruptions to trial operations with 13,323 clinical trials terminated, suspended or withdrawn over the course of the pandemic, a 38% increase compared to the 9,665 trials that stopped in the 3 years prior to the pandemic. Registries indicated a sharp decline in new participant enrollment across geographic regions and therapeutic areas, with partial recovery in later months. Review findings highlighted barriers including patient inaccessibility, staff redeployment, and supply chain interruptions. Conclusions: The pandemic caused system-wide operational shocks that compromised trial timelines and may have downstream methodological consequences. Recovery in enrollment does not imply restoration of pre-pandemic protocol fidelity or outcome ascertainment. Standardized reporting of disruptions, proactive contingency planning, and resilient trial designs are needed to maintain data integrity during large-scale disruptions and to support reliable evidence generation.
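The 38% figure in the abstract above follows from the two reported counts. A minimal arithmetic check, assuming nothing beyond those two numbers (the function name is illustrative and not from the preprint):

```python
def percent_increase(before: int, after: int) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100


# Counts reported in the abstract: trials terminated, suspended or withdrawn.
pre_pandemic_stopped = 9_665
pandemic_stopped = 13_323

increase = percent_increase(pre_pandemic_stopped, pandemic_stopped)
print(f"{increase:.1f}% increase")
```

The unrounded value is about 37.8%, which the abstract reports as 38%.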
Leonard, S. A.; Dysart, K.; Callahan, A.; Siadat, S.; Zhang, J.; Handley, S. C.; Huybrechts, K. F.; Igbinosa, I.; Bateman, B. T.
Background: Epic Cosmos is a relatively new centralized electronic health record dataset with high potential utility in perinatal epidemiologic research. Objectives: The study objectives were to develop replicable steps to create longitudinal, linked maternal-infant cohorts in Cosmos, assess completeness of key variables, evaluate potential selection bias with restrictions for longitudinal healthcare encounters, and provide an example epidemiologic analysis. Methods: We created maternal-infant cohorts by starting with live births during 2023-2024 recorded in the BirthFact data table and joining with additional data tables as needed. We selected and created variables for perinatal characteristics, common comorbidities, and routinely measured vital signs and laboratory values, and assessed variable completeness. We sequentially restricted the birth cohort for maternal-infant linkage and longitudinal healthcare from first-trimester prenatal care encounter through infant follow-up care within 12 weeks post-discharge from birth hospitalization. Finally, we conducted an example analysis of the association between high systolic blood pressure in the first trimester (≥140 mm Hg) and later onset of preeclampsia among those with chronic hypertension. Results: The total linked birth cohort included 2,624,186 pregnancies. Completeness was >90% for most variables assessed but was 77% for racial and ethnic group and 76% for body mass index at delivery. Characteristics of the cohort were similar to those reported for the entire United States birth population based on birth certificate data, including similar regional and racial-ethnic composition. Longitudinal cohort restriction requiring linked records from first trimester prenatal care through infant follow-up care reduced the cohort size to 509,148 pregnancies. However, restriction had minimal effects on cohort characteristics.
In the example analysis, high systolic blood pressure was associated with increased risk of preeclampsia among those with chronic hypertension (aRR: 1.26; 95% CI: 1.22, 1.30). Conclusions: This study provides a rigorous and reproducible approach to creating longitudinal, linked maternal-infant cohorts in Epic Cosmos and the analytical findings suggest high data quality and representativeness.
Raghavan, S.; Liu, W. G.; Ho, M. R.; Warsavage, T.; Ghosh, D.; Caplan, L.; Reusch, J. E.
Objectives: Diabetes affects over 500 million people globally and glycemia is inadequately managed. Metformin is the most frequently prescribed initial treatment for type 2 diabetes globally, yet glycemic response trajectories to metformin in routine real-world care and predictors of treatment response have not been well described. We aimed to identify glycemic response trajectories in adults prescribed metformin monotherapy as initial type 2 diabetes treatment and predictors of poor glycemic response to metformin. Design: Observational cohort study using latent class mixed models to identify hemoglobin A1c (HbA1c) trajectory classes, followed by random forests machine learning to predict trajectory class membership. Setting: US Veterans Affairs Healthcare System Participants: Adults treated with metformin alone for >30 days after diabetes diagnosis with a minimum of two HbA1c measurements from 90 days prior to two years after the first metformin prescription (N=140,413). Exposures: Demographic, laboratory, vital sign, and comorbidity data were included as predictors of metformin response trajectory Main Outcomes and Measures: We included all HbA1c measurements (487,604 total) for two years after metformin initiation to define metformin glycemic response trajectories. Results: We identified three HbA1c trajectories: stably low (89.7% of sample, mean HbA1c decrease from 7.2% to 6.6%), brisk response (7.1% of sample, mean HbA1c decrease from 11.4% to 7.0%), and non-response (3.1% of sample, mean HbA1c increase from 8.9% to 10.8%). Of those in the stably low and brisk response classes at 2 years, 91% maintained HbA1c at approximately 7% on metformin alone for 5 years after drug initiation. Prediction models could accurately predict brisk response (91% accuracy) but not metformin non-response (59% accuracy). Conclusions: Most individuals treated initially with metformin monotherapy have a beneficial and durable glycemic response. 
Predicting individuals who will not respond to metformin may be challenging but is evident within six months with recommended glycemic surveillance. The findings support current guidelines for HbA1c surveillance when initiating diabetes treatment.
Borges, M. C.; Urquijo, H.; Yang, Q.; van der Graaf, A.; McBride, N.; Haug, E. B.; Soares, A. G.; Clayton, G. C.; Bond, T. A.; Al Arab, M.; Horn, J.; Thomas, L.; Bhatta, L.; Asvold, B. O.; Magnus, M. C.; Evans, D. M.; Burden, C.; Birchenall, K.; Brumpton, B.; Gaunt, T. R.; Hart, E. C.; Kutalik, Z.; Lawlor, D. A.
Background and Aims Hypertension during pregnancy is a major cause of maternal and neonatal morbidity and mortality, yet the efficacy and safety of antihypertensive treatments in this setting remain uncertain. We evaluated the effects of antihypertensive drug targets on adverse pregnancy-related outcomes using genetic variants to instrument target perturbation. Methods We performed drug target Mendelian randomization to mimic pharmacological perturbation of targets from six commonly used antihypertensive drug classes, using data from up to 671,922 pregnant women. Genetic variants near drug target genes associated with systolic or diastolic blood pressure were selected as instruments. We estimated effects of target modulation on six primary and eight secondary pregnancy outcomes. Results Genetically instrumented downregulation of blood pressure through beta-blocker (BB) and calcium-channel blocker (CCB) targets, particularly ADRB1 and CACNB2, was associated with a reduced risk of hypertensive disorders of pregnancy, including preeclampsia. For example, CACNB2-instrumented lowering corresponded to a 7% (95% CI: 5-9%) reduction in preeclampsia risk per 1 mmHg decrease in blood pressure. For most other targets, estimates were directionally consistent but imprecise. Across additional outcomes, effects varied by target, with suggestive evidence for reduced risks of miscarriage, preterm birth, small-for-gestational-age birth, and labour induction, although these estimates were accompanied by substantial uncertainty. Conclusions These findings support a protective effect of BB and CCB targets on hypertensive disorders of pregnancy and highlight potential target-specific differences in safety. This work illustrates the value of Mendelian randomization in addressing clinical uncertainties where robust trial evidence is limited.
Irlmeier, R.; Jin, Z.; Ye, F.
Background Simon two-stage designs for binary endpoints and their time-to-event analogues, including the Kwak and Jung method, rely on a fixed null benchmark. Their Type I error control is valid only when that benchmark is correctly specified. In practice, historical benchmarks are often inconsistent due to small samples, population heterogeneity, changing eligibility criteria, and evolving standards of care. Even modest misspecifications can substantially inflate the Type I error rate, leading to costly advancement of ineffective treatments. Methods We propose the Interval-Null Robust (INR) two-stage design framework that accounts for uncertainty in the historical null benchmark. We define the null hypothesis as a plausible range of clinically uninteresting values: $p \in [p_{0L}, p_{0U}]$ for binary endpoints and $\lambda \in [\lambda_{0L}, \lambda_{0U}]$ (or equivalent survival probabilities) for time-to-event endpoints. Type I error is controlled uniformly over the full null interval: $\sup_{\theta \in \Theta_0} \Pr_{\theta}(\mathrm{Go}) \le \alpha$. Under the monotonicity of the Go probability, the supremum occurs at the least favorable null configuration ($p_{0U}$ and $\lambda_{0L}$), but the design is not reduced to a point-null formulation. The interval defines the uncertainty set for error control and is used in selecting among feasible designs through robust criteria such as worst-case regret or minimal average expected sample size. Results Across representative planning scenarios for both endpoint types, classic designs calibrated to a single benchmark exhibit substantial Type I error inflation when the true null parameter exceeds the assumed planning value. INR designs maintain the nominal Type I error rate across the full null interval, directly addressing this vulnerability to benchmark misspecification. The robustness-efficiency trade-off can be managed through design constraints and robust optimization criteria while preserving uniform Type I error control.
Conclusions INR two-stage designs offer a transparent framework for addressing historical control uncertainty in single-arm Phase II trials. By replacing reliance on a fixed benchmark assumption with a more realistic interval of clinically plausible null values, INR design reduces the risk of false-positive Go-decisions caused by benchmark misspecification. INR applies to both binary and time-to-event endpoints and is implemented in the open-source INRDesign R package and accompanying interactive Shiny app.
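To make the interval-null idea in the abstract above concrete, the sketch below computes the Go probability of a generic two-stage binary design and takes its maximum over a null interval. It is not the INRDesign package and does not reproduce the authors' optimization. The design parameters (`r1`, `n1`, `r`, `n`) and the interval endpoints are illustrative assumptions, and the stopping rule assumed is the usual Simon-type one: stop after stage 1 if responses are at most `r1`, and declare Go if total responses exceed `r`.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of exactly k responses among n patients."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def go_probability(p: float, r1: int, n1: int, r: int, n: int) -> float:
    """P(Go | response rate p): pass stage 1 (> r1 of n1) and end with > r of n."""
    n2 = n - n1
    total = 0.0
    for x1 in range(r1 + 1, n1 + 1):
        # Stage-2 responses needed so that x1 + x2 > r.
        first_x2 = max(r - x1 + 1, 0)
        tail = sum(binom_pmf(x2, n2, p) for x2 in range(first_x2, n2 + 1))
        total += binom_pmf(x1, n1, p) * tail
    return total


def worst_case_type1(p_lo, p_hi, r1, n1, r, n, steps=50):
    """Largest Go probability over a grid spanning the null interval."""
    grid = [p_lo + (p_hi - p_lo) * i / steps for i in range(steps + 1)]
    return max(go_probability(p, r1, n1, r, n) for p in grid)


# Illustrative design and null interval (assumptions, not from the preprint).
design = dict(r1=1, n1=10, r=5, n=29)
p0L, p0U = 0.10, 0.15

print("Go probability at p0L:", go_probability(p0L, **design))
print("Go probability at p0U:", go_probability(p0U, **design))
print("Worst case over interval:", worst_case_type1(p0L, p0U, **design))
```

Because the Go probability increases with the true response rate, the grid maximum sits at the upper endpoint. This is why a design calibrated only at the lower value can exceed its nominal error rate when the true null rate is higher.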
Sun, H.; Jackson, S. E.; Xiao, L.; Cox, S.; Oldham, M.; Tattan-Birch, H. O.
Abstract Aims To examine which demographic groups nicotine pouch advertisers chose to target on social media, and which groups Meta's algorithms actually delivered the adverts to. Design Cross-sectional analysis of advert-level data from the Meta Ad Library. Setting Meta social media platforms (including Facebook and Instagram) in the UK. Cases A random sample of 741 nicotine pouch adverts shown in the 12 months up to December 2025, and a comparison sample of 1,125 general adverts. Analyses of reach were restricted to adverts eligible for all genders and adult ages (444 pouch adverts; 674 general). Measurements Outcomes were advertiser-set gender and age-group targeting criteria (i.e., groups eligible to be shown each advert) and estimated advert reach to each group (i.e., number of people who saw each advert). Male-to-female reach ratios within age groups, and reach ratios comparing age groups, were calculated per advert and summarised using geometric means. To assess whether patterns were pouch-specific, comparisons with general adverts were made using ratios of reach ratios (RRR). Findings Advertisers of nicotine pouches targeted a broad sample; most adverts (79.1%; 586/741) were eligible to be shown to all genders, the remainder were restricted to men only. All were restricted to adults (minimum age 18 years) and most (95.6%; 708/741) had no upper age limit. Despite this, of pouch adverts eligible to be shown to all adults, adverts were more likely to reach men, particularly among younger men. Among 18-24-year-olds, pouch adverts reached around ten times as many men as women (RR 10.0, 95% CI 8.7-11.5), compared with a slight skew towards women for general adverts (RR 0.81, 95% CI 0.71-0.94), corresponding to an RRR of 12.3 (95% CI 10.0-15.1). Pouch adverts also showed a skew in reach towards younger age groups. 
Relative to those aged 35-44 years, reach was higher among 18-24-year-olds for nicotine pouch adverts (RR 1.33, 95% CI 1.17-1.51) but much lower for general adverts (RR 0.19, 95% CI 0.17-0.21), corresponding to an RRR of 7.0 (95% CI 6.0-8.2). Conclusions Nicotine pouch adverts on social media are often eligible to be shown broadly to all demographic groups but are disproportionately delivered to young men.
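The ratio of reach ratios (RRR) in the abstract above is the pouch-advert reach ratio divided by the general-advert reach ratio. The sketch below recomputes both reported RRRs from the rounded reach ratios and shows a geometric mean on the log scale. The per-advert example values are hypothetical, and the function names are illustrative rather than the authors' code.

```python
import math


def geometric_mean(ratios):
    """Geometric mean of positive ratios, computed on the log scale."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def ratio_of_reach_ratios(rr_pouch: float, rr_general: float) -> float:
    """How much stronger the skew is for pouch adverts than for general adverts."""
    return rr_pouch / rr_general


# Hypothetical per-advert male-to-female reach ratios (not study data).
example_ratios = [2.0, 8.0]
print("Geometric mean:", geometric_mean(example_ratios))

# Rounded reach ratios as reported in the abstract.
rrr_gender_18_24 = ratio_of_reach_ratios(10.0, 0.81)
rrr_age_18_24_vs_35_44 = ratio_of_reach_ratios(1.33, 0.19)
print(f"RRR (men vs women, 18-24): {rrr_gender_18_24:.1f}")
print(f"RRR (18-24 vs 35-44): {rrr_age_18_24_vs_35_44:.1f}")
```

Recomputing from rounded inputs gives values close to the reported 12.3 and 7.0. Small differences would be expected if the authors worked from unrounded estimates.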
Khan, M. M.; Anwar, M. N.
Background: Large language models (LLMs) are increasingly used in telehealth, but their safety in antibiotic prescribing remains uncertain, particularly in the presence of patient misinformation. Methods: A cross-sectional analytical study evaluated 5,000 responses from five chatbot models using 1,000 primary-care vignettes of mild infections. Guideline adherence, overprescribing, misinformation effects, and safety behaviors were assessed. Inappropriate prescriptions were classified using the WHO AWaRe framework. Results: Overall, 76.2% of responses were guideline-concordant, while 6.6% showed unprompted overprescribing and 17.2% were influenced by misinformation. Some models were more vulnerable to misinformation than others. Although most responses correctly noted that antibiotics do not treat viral infections, fewer advised consulting a doctor, and warnings against self-medication were rare. Many inappropriate prescriptions involved broad-spectrum antibiotics. Conclusion: LLMs show potential in telehealth but remain prone to misinformation and inappropriate prescribing. Stronger guideline integration and clinical oversight are necessary to ensure safe use. Keywords: antimicrobial stewardship; large language models; telehealth; antibiotic prescribing; misinformation; clinical safety
Roehrig, J.; Sutter, L.; Witsch, N.; Rademacher, L.; Cabanis, M.
Background and Aims: Synthetic opioids cause tens of thousands of deaths each year in North America, and there are indications that synthetic opioids are also becoming increasingly prevalent in the European drug market. This study aimed to examine high-risk substance use in the German drug-using community with a particular focus on the synthetic opioids fentanyl and nitazenes and related awareness, concerns, overdose experiences, and harm-reduction behavior. Design: Cross-sectional, observational online survey. Setting: Open drug-use scenes, addiction clinics, and substitution practices at numerous geographic locations throughout Germany, August to September 2025. Participants: 235 individuals aged 14+ from the drug using community (mean age 43.4 years; 57.9% male), 79.6% recruited by peers in open drug-use scenes. Measurements: The primary outcome was substances used within the past 12 months. In addition, sources, forms, routes of administration, and perceived changes in availability and price of (synthetic) opioids were assessed as well as risk perceptions, fears, harm-reduction behavior, and overdose-related experiences. Findings: 227 respondents reported substance use with an average of 6.2 substances, and 73.1% (95% confidence interval [CI] = 67.0-78.5%) had used at least one opioid in the past year. Synthetic opioids were consumed in many parts of Germany and across all age and gender groups. Among participants who experienced a shortage of their primary opioid in the past year, 25% (95% CI = 15.8-37.2%) reported having used fentanyl instead. 56.5% (95% CI = 36.8-74.3%) of individuals using synthetic opioids reported having experienced an overdose in the past twelve months. Most of the respondents perceived synthetic opioids as posing a high risk, and a substantial proportion expressed fear that they could be mixed into their own substances. 
However, only 9.9% (95% CI = 6.6-14.7%) use drug checking, although the vast majority stated they would use it if it were available to them. Conclusions: Synthetic opioids, including fentanyl and nitazenes, have entered the German drug scene, with users reporting high rates of overdose and limited access to harm reduction measures. Germany may be in an early phase of a synthetic opioid transition, warranting urgent expansion of surveillance, naloxone distribution, and drug checking services.
Kleinbloesem, C. H.; Braal, C. L.
Background Classical pharmacokinetic-pharmacodynamic (PK/PD) theory models exposure-effect in two dimensions: magnitude and time. Rate-dependent toxicity has been documented across therapeutic domains but never formalised as a conserved biological constraint. Methods We developed the Human Adaptive Rate Limit (HARL) framework, formalising the maximum tolerable velocity as |dS/dt|_max = sigma_max / tau. We validated HARL across five domains using published trial data and a reanalysis of the longitudinal biomarker data from the 202-patient CAR-T cohort of Wei et al (2023). An 8-ODE quantitative systems pharmacology model guided biomarker selection. Early biomarker velocities (maximum positive slope within days 0-5) were computed for ferritin and D-dimer. Patients were classified as high-risk only if both velocities exceeded their thresholds (dual-velocity classifier). Thresholds were identified by grid-search optimisation of the Youden index and assessed by leave-one-out cross-validation. Findings A prospective crossover study (Kleinbloesem 1987, n=8) demonstrated that matched steady-state nifedipine concentrations produce divergent haemodynamic responses depending solely on rate of rise, anticipating the dose-related mortality signal subsequently reported across ~8350 patients with coronary heart disease (Furberg 1995), a meta-analysis that was itself debated. Convergent evidence spans haematology (CHOIR, 1432 patients, hazard ratio [HR] 1.34 [1.03-1.74] for aggressive Hb correction), radiation (dose-rate effectiveness factor [DDREF] 1.5-2.0), and infusion pharmacology. In the CAR-T cohort, high-risk classification (ferritin >232 ng/mL per day AND D-dimer >1.21 mg/L per day) predicted severe CRS with 100% sensitivity (~78% specificity) in safety rule-out mode and 91.1% sensitivity (93.6% specificity, AUC 0.95 [95% CI 0.91-0.98]) in Youden-optimised mode. Median kinetic lead time was 4 days (range 3-7) before clinical decompensation. 
Interpretation Biological tolerability is three-dimensional. HARL unifies rate-dependent toxicity across domains spanning minutes to weeks. MTDyn--specifying target level and allowable rate of change--should supplement conventional dose-response assessment.
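The dual-velocity classifier described in the abstract above can be sketched as follows. The two thresholds are the ones quoted in the abstract. Everything else is an assumption made for illustration: slopes are taken between consecutive measurements, the window is days 0 to 5 inclusive, and the patient values are invented.

```python
FERRITIN_THRESHOLD = 232.0  # ng/mL per day, as quoted in the abstract
D_DIMER_THRESHOLD = 1.21    # mg/L per day, as quoted in the abstract


def max_positive_slope(days, values, window=(0, 5)):
    """Largest per-day rise between consecutive measurements inside the window.

    Returns 0.0 when the biomarker never rises. Using consecutive-point
    slopes is an assumption; the preprint may define velocity differently.
    """
    points = [(d, v) for d, v in zip(days, values) if window[0] <= d <= window[1]]
    slopes = [
        (v2 - v1) / (d2 - d1)
        for (d1, v1), (d2, v2) in zip(points, points[1:])
        if d2 > d1
    ]
    return max((s for s in slopes if s > 0), default=0.0)


def is_high_risk(ferritin_velocity: float, d_dimer_velocity: float) -> bool:
    """High risk only if BOTH velocities exceed their thresholds."""
    return (
        ferritin_velocity > FERRITIN_THRESHOLD
        and d_dimer_velocity > D_DIMER_THRESHOLD
    )


# Hypothetical patient (not data from the cohort).
days = [0, 1, 3, 5]
ferritin = [400.0, 500.0, 1100.0, 1500.0]  # ng/mL
d_dimer = [0.5, 1.0, 4.0, 4.5]             # mg/L

ferritin_v = max_positive_slope(days, ferritin)
d_dimer_v = max_positive_slope(days, d_dimer)
print(ferritin_v, d_dimer_v, is_high_risk(ferritin_v, d_dimer_v))
```

Requiring both velocities to exceed their thresholds trades some sensitivity for specificity compared with flagging on either biomarker alone.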
Sehgal, N. K. R.; Tronieri, J. S.; Rader, B.; Ungar, L.; Guntuku, S. C.
Gray-market retatrutide use is increasing, but patient safety experiences remain poorly characterized. This cross-sectional analysis examined Reddit posts and comments from retatrutide-specific and broader peptide or weight-management communities through December 2025. A validated large language model classified self-reported retatrutide use and extracted author-attributed symptoms mapped to MedDRA Preferred Terms. Among 13,589 users reporting current use, 7,823 had at least one mapped symptom after exclusions. Unlike phase 2 trial findings dominated by gastrointestinal events, Reddit reports most often described appetite increase, fatigue, increased energy, nausea, food craving, insomnia, and elevated heart rate. Findings are hypothesis-generating and warrant pharmacovigilance attention.
Xu, S.; Sy, L. S.; Hong, V.; Farrington, P.; Glenn, S. C.; Kim, S.; Ryan, D. S.; Tubert, J. E.; Tong, P.; Lewin, B. J.; Tseng, H. F.; Carbayo, A.; Davis, C.; Sangha, N. S.; Belongia, E. A.; Sundaram, M. E.; Nelson, J. C.; Daley, M. F.; Klein, N. P.; Fireman, B.; Haapala, J.; Hurley, L. P.; Irving, S. A.; Cocoros, N. M.; Weintraub, E. S.; Duffy, J.; Qian, L.
Background: The Vaccine Safety Datalink (VSD) detected a statistical signal for ischemic events (ischemic stroke or transient ischemic attack) following bivalent mRNA COVID-19 vaccination through prospective surveillance during 2022-2023. Although multiple studies from other surveillance systems and countries reported no increased risk, important methodological limitations remained. This U.S. study addressed those limitations by evaluating the ischemic stroke risk following bivalent mRNA COVID-19 vaccination, influenza vaccination, and their same-day coadministration using an event-dependent self-controlled case series (SCCS) design. Methods: Study outcomes included first-ever ischemic stroke (primary outcome), first-in-1-year ischemic stroke (secondary outcome), and ischemic events (exploratory outcomes), identified using ICD-10-CM codes in inpatient and emergency department settings during September 1, 2022-March 31, 2023, among individuals aged ≥12 years across eight VSD sites. Analyses were conducted separately for Pfizer-BioNTech and Moderna bivalent vaccines, with relative incidences (RI) and 95% confidence intervals (CI) estimated for 1-21-day and 1-42-day risk intervals, using person-time outside these intervals as the control period. Subgroup analyses were performed by age group (12-64, >65 years) and history of documented SARS-CoV-2 infection. Results: A total of 6,510 first-ever ischemic strokes were identified among more than 6.8 million participants. Among recipients of Pfizer-BioNTech bivalent COVID-19 and influenza vaccines, no statistically significant increased risk of first-ever ischemic stroke was observed following bivalent COVID-19 vaccination (RI=0.94; 95% CI: 0.63-1.41), influenza vaccination (RI=0.95; 95% CI: 0.82-1.10), or same-day coadministration (RI=1.15; 95% CI: 0.88-1.49) within 1-21-day risk intervals; findings were similar for 1-42-day intervals.
Comparable null results were observed for Moderna vaccines and across all subgroups, secondary, and exploratory outcomes. Conclusion: No increased risk of ischemic stroke was found following bivalent mRNA COVID-19 vaccination, influenza vaccination, or their coadministration in this multi-site SCCS study. These findings are consistent with previous studies and underscore the importance of continued vaccine safety monitoring.
Kulkarni, P.; Ndai, A.; Keshwani, S.; Smith, K. M.; Choi, J.; Luvera, M.; Hunter, J.; Wright, S.; Hetzel, J.; Pepine, C. J.; Schmidt, S.; Morris, E.; Smith, S.
Background: Dihydropyridine calcium channel blockers (DHP-CCB) are widely prescribed antihypertensives whose adverse effects may trigger unnecessary prescribing of additional medications, termed prescribing cascades (PC). We aimed to identify potential DHP-CCB-induced PCs using high-throughput sequence symmetry analysis (HTSSA). Methods: Using Medicare claims data (2011-2020), we identified new users aged ≥66 years with continuous enrollment ≥360 days before and ≥180 days after DHP-CCB initiation. We screened for initiation of 446 "marker" drug classes within ±90 days of DHP-CCB initiation. Sequence ratios compared marker drug initiation after versus before DHP-CCB initiation. Adjusted sequence ratios (aSR), accounting for prescribing trends over time, were calculated, with a lower 95% CI bound >1 considered statistically significant. Clinical experts classified statistically significant signals as potential PCs through consensus. Results: Among 388,862 DHP-CCB initiators (mean age 76.6 ± 7.5 years; 62.5% women, 92.3% with hypertension), 82 of 446 marker drug classes had significantly elevated aSRs, of which 24 were classified as potential PCs. Strongest signals ranked by highest aSR included other systemic hemostatics (aSR 2.99; 95% CI, 1.10-8.16), other nasal preparations (aSR 1.99; 95% CI, 1.47-2.70), and drugs used in erectile dysfunction (aSR 1.85; 95% CI, 1.27-2.70). Other clinically relevant signals, ranked by number needed to harm (lowest to highest), included sulfonamides (NNTH 104; 95% CI, 98-111), electrolyte solutions (NNTH 216; 95% CI, 196-241), and osmotically acting laxatives (NNTH 710; 95% CI, 540-1056). Conclusion: Potential PCs identified in this Medicare cohort reflected known and underrecognized adverse effects of DHP-CCBs. Further studies are needed to evaluate the clinical consequences of these PCs.
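The sequence ratio at the core of the sequence symmetry analysis in the abstract above compares how many patients start a marker drug after the index drug with how many start it before. The sketch below is not the authors' high-throughput pipeline. The offsets are hypothetical, same-day initiations are excluded by assumption, and the null-effect ratio used for the trend adjustment is passed in as a given number rather than estimated.

```python
def crude_sequence_ratio(marker_offsets_days, window=90):
    """Patients starting the marker drug after vs. before the index drug.

    Each offset is (marker start date - index drug start date) in days.
    Offsets of 0 or outside +/- `window` days are ignored (an assumption).
    """
    after = sum(1 for d in marker_offsets_days if 0 < d <= window)
    before = sum(1 for d in marker_offsets_days if -window <= d < 0)
    if before == 0:
        raise ValueError("no marker initiations before the index drug")
    return after / before


def adjusted_sequence_ratio(crude_sr: float, null_effect_sr: float) -> float:
    """Divide out the ratio expected from prescribing trends alone."""
    return crude_sr / null_effect_sr


# Hypothetical offsets for one marker drug class (not study data).
offsets = [-30, 10, 20, 45, 0, 120]

crude = crude_sequence_ratio(offsets)
adjusted = adjusted_sequence_ratio(crude, null_effect_sr=1.2)
print(crude, adjusted)
```

A ratio above 1 means the marker drug is started more often after the index drug than before it, which is the asymmetry that flags a possible prescribing cascade.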
Silverman, R. A.; Ahrens, M. L.; Helmick, M.; Finkielstein, C. V.; Cohen, A.; Short, E.; Bordwine, P.
Background and Objectives: SARS-CoV-2 (COVID-19) continues to mutate, circulate, and adversely impact health and quality of life. While COVID-19 vaccines remain safe and effective, uptake remains low, especially among children, the youngest of whom were not vaccine-eligible until after Omicron and are underrepresented in published research. This study estimated vaccine effectiveness (VE) among under-5-year-olds. Methods: We used Virginia Department of Health surveillance data from June 2022 through October 2022 to conduct a test negative case-control study. We estimated VE derived from odds ratios (ORs) of reported infections using logistic regression among children aged 6-months to 5-years. Results: Using the earliest positive (cases) or negative (controls) post-vaccine-eligible test results, the VE associated with two doses of a COVID-19 vaccine was 78% (95% CI=45%, 93%; p=0.004) in unadjusted analyses and 70% (95% CI=25%, 91%, p=0.023) when adjusting for age, sex, prior testing behavior, and prior reported infections. The adjusted VE was 74% (95% CI=28%, 94%; p=0.025) among those with no prior positives reported and 45% (95% CI=-302%, 97%; p=0.569) among those with a prior positive reported. Conclusions: These results show that even though the vaccine was not closely matched to the dominant variants circulating during the time period analyzed, it was effective at reducing the risk of reported infections. This study adds to the body of knowledge on pediatric COVID-19 VE in an underrepresented age-group and in a rural region, illustrates the utility of surveillance data for evaluation, and can inform vaccine decisions to improve vaccine uptake for young children.
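In a test-negative design such as the one in the abstract above, vaccine effectiveness is conventionally derived from the odds ratio as VE = (1 − OR) × 100%. The sketch below shows only that unadjusted conversion with hypothetical counts. It does not reproduce the logistic regression adjustment the authors describe, and the function names are illustrative.

```python
def odds_ratio(vacc_cases, unvacc_cases, vacc_controls, unvacc_controls):
    """Odds of vaccination among test-positives relative to test-negatives."""
    return (vacc_cases / unvacc_cases) / (vacc_controls / unvacc_controls)


def vaccine_effectiveness(or_value: float) -> float:
    """VE in percent; an odds ratio above 1 yields a negative VE."""
    return (1 - or_value) * 100


# Hypothetical 2x2 counts (not the Virginia surveillance data).
or_example = odds_ratio(
    vacc_cases=5, unvacc_cases=100, vacc_controls=40, unvacc_controls=200
)
print(f"OR = {or_example:.2f}, VE = {vaccine_effectiveness(or_example):.0f}%")

# A VE of 78% corresponds to an odds ratio of 0.22.
print(f"VE at OR 0.22: {vaccine_effectiveness(0.22):.0f}%")
```

The same conversion explains how a confidence bound can fall below zero, as with the −302% lower bound reported for children with a prior positive test: it corresponds to an odds ratio of about 4.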
Doan, L. V.; Hung, A. M.; Olfson, M.; Williams, N. T.; Rudolph, K. E.
Introduction: Acute low back pain is a leading cause of disability worldwide. Clinical guidelines recommend non-pharmacological therapies as first-line treatment and advise caution with opioid prescribing. However, pharmacological therapies, including opioids and gabapentinoids, remain commonly used. The comparative risks of subsequent opioid use disorder (OUD) and overdose diagnosis associated with initial treatment modality in large, real-world populations are not well characterized. We estimated the incidence of new-onset OUD and overdose diagnosis among opioid-naive, Medicaid-insured adults with newly diagnosed acute low back pain and estimated the association between initial treatment modalities and subsequent OUD and overdose diagnosis risk. Methods: We conducted a retrospective cohort study using Medicaid T-MSIS Analytic files from 25 states (2016-2019). We identified opioid-naive adults with a new diagnosis of acute low back pain who initiated pharmacologic or non-pharmacologic treatment within 1 month of diagnosis. The primary outcome was incident OUD and overdose diagnosis (based on diagnosis codes in claims) during follow-up. Associations between initial treatment modality and OUD and overdose diagnosis risk were estimated using a non-parametric, doubly robust estimator to adjust for measured confounding. Results: The cohort included 525,002 opioid-naive adults initiating treatment for low back pain. The cumulative incidence of OUD and overdose diagnosis was 1.5% and 2.4% at 7 and 13 months, respectively. Compared to non-use, use of gabapentinoids during the first month of treatment was associated with the largest relative increase in risk (130.1%; 95% confidence interval [CI]: 117.8%, 142.3%); the second-largest was estimated for higher-dose opioids, defined as >50 daily morphine milligram equivalents (MME) (118.1%; 95% CI: 99.2%, 137.0%).
Lower-dose, short-duration opioids (≤50 MME, ≤7 days) were also associated with elevated risk, though substantially smaller in magnitude (20.8%, 95% CI: 13.8%, 27.9%). In contrast, non-pharmacologic, non-interventional therapies were associated with reduced OUD and overdose diagnosis risk, with physical therapy demonstrating the largest relative reduction of 34.0% (95% CI: -40.9%, -27.1%). Discussion: In opioid-naive Medicaid patients with acute low back pain, initial non-pharmacologic treatment was associated with reduced OUD and overdose diagnosis risk. Gabapentinoids and opioids were each associated with increased risk; for opioids, the degree of risk increased with higher doses and durations. These results support guideline recommendations favoring non-pharmacologic treatment as first-line therapy and indicate the importance of cautious prescribing when pharmacologic treatment is considered.
Hagan, J.
Background. Cross-validation (CV) is widely used to estimate predictive performance, but can overestimate performance when applied at the observation level to repeated-measures data. When continuous predictor variables are measured repeatedly within subjects and the binary outcome is defined at the subject level, naive observation-level CV introduces data leakage through within-subject dependence, producing optimistically biased estimates of the area under the receiver operating characteristic curve (AUROC). The magnitude of this bias and the performance of alternative partitioning strategies have not been formally characterized for this data structure. Methods. Three CV strategies were compared for estimating subject-level AUROC in ridge logistic regression models: naive observation-level 10-fold CV, subject-level 10-fold CV, and leave-one-cluster-out (LOCO) CV. The framework was applied to a motivating clinical dataset of daily oxygenation measures and retinopathy of prematurity outcomes among 101 extremely low birth weight infants. A factorial simulation study was conducted across 162 parameter combinations varying cluster count (20-150), intraclass correlation (0.1-0.5), within-cluster autocorrelation (0.2-0.8), and outcome prevalence (10-35%), with 500 simulated datasets per condition (76,389 valid datasets total). Results. In the motivating dataset, naive CV produced optimism of +0.078 AUROC units for severe ROP prediction (15 events, 101 subjects) and +0.031 for any ROP prediction (48 events). Subject-level 10-fold CV closely approximated LOCO (deviation ≤ 0.015). In the simulation, naive CV optimism ranged from +0.039 to +0.204 across all conditions, increasing monotonically with higher ICC, higher autocorrelation, fewer clusters, and lower event rates. Subject-level 10-fold CV was essentially unbiased relative to LOCO across all 162 conditions (mean absolute deviation = 0.002). Conclusions.
Naive observation-level CV meaningfully overestimates discriminative performance in the repeated-measures binary outcome setting and should not be used. Subject-level CV partitioning effectively eliminates this bias. Accordingly, subject-level partitioning should be considered essential, not optional, when validating prediction models using repeated-measures data with subject-level outcomes.
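The difference between the two partitioning strategies named in the abstract above can be illustrated with a minimal sketch. It is not the author's code: the function names, the toy data (20 subjects with 5 rows each), and the round-robin assignment of subjects to folds are all assumptions made for illustration, and only the Python standard library is used. The sketch counts how often a subject contributes rows to both the training and the test side of a split, which is the leakage the abstract describes.

```python
import random
from collections import defaultdict

def naive_folds(n_obs, k, seed=0):
    """Observation-level folds: rows are shuffled without regard to subject."""
    idx = list(range(n_obs))
    random.Random(seed).shuffle(idx)
    return [sorted(idx[i::k]) for i in range(k)]

def subject_level_folds(subject_ids, k, seed=0):
    """Subject-level folds: all rows from one subject land in the same fold."""
    rows_by_subject = defaultdict(list)
    for row, sid in enumerate(subject_ids):
        rows_by_subject[sid].append(row)
    subjects = sorted(rows_by_subject)
    random.Random(seed).shuffle(subjects)
    folds = [[] for _ in range(k)]
    for pos, sid in enumerate(subjects):
        # Round-robin assignment of whole subjects (an illustrative choice).
        folds[pos % k].extend(rows_by_subject[sid])
    return [sorted(f) for f in folds]

def leaked_subjects(subject_ids, test_rows):
    """Subjects with rows in both the test fold and the remaining training rows."""
    test_set = set(test_rows)
    test_subjects = {subject_ids[r] for r in test_rows}
    train_subjects = {s for r, s in enumerate(subject_ids) if r not in test_set}
    return test_subjects & train_subjects

# Hypothetical toy data: 20 subjects with 5 repeated measurements each.
subject_ids = [s for s in range(20) for _ in range(5)]
naive = naive_folds(len(subject_ids), k=10)
grouped = subject_level_folds(subject_ids, k=10)

naive_leaks = sum(len(leaked_subjects(subject_ids, f)) for f in naive)
grouped_leaks = sum(len(leaked_subjects(subject_ids, f)) for f in grouped)
print("subject/fold overlaps, naive:", naive_leaks)
print("subject/fold overlaps, subject-level:", grouped_leaks)
```

Under subject-level partitioning the overlap count is zero by construction; setting `k` equal to the number of subjects gives the leave-one-cluster-out scheme.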
Blotske, K.; Zhao, X.; Henry, K.; Murray, B.; Gao, Y.; Smith, S. E.; Wayne, N.; Ku, P.; Smith, B.; Moua, S.; Sikora, A.
Background: Electrolyte replacement is ubiquitous in the acute care setting, but its familiarity should not obscure the fact that even small dosing errors with potassium can cause lethal cardiac arrhythmias. Recently, MedAgentBench offered a benchmark for agentic artificial intelligence (AI) that includes the ability to correctly dose potassium based on a single rule; however, this does not adequately reflect the clinical complexity or safety concerns of an agent that has been used for lethal injection. The purpose of this analysis was to probe the ability of a leaderboard large language model (LLM) to follow basic dosing rules to safely replace potassium in a series of clinician-annotated cases. Methods: Using a clinician panel, we developed a series of dosing principles and 20 clinical cases reflective of the complexity of potassium replacement. External clinicians were surveyed to assess practice variability and agreement with the clinician panel's answers. We tested GPT-5-Chat with each case in triplicate, with and without the clinician-curated dosing principles, and prompted the model to answer six questions involving potassium goals, dosing, route, lab frequency, concurrent interventions, and the model's perceived level of confidence for the output and complexity of the case. The primary outcome was the rate of appropriate recommendations in comparison to clinician answers. Results: A total of 54 clinicians reviewed the 20 hypokalemia cases and hypokalemia dosing guideline. When asked whether they agreed with the guideline-recommended management, clinicians answered "highly agree" or "somewhat agree" for 66.8% of the cases evaluated. When given the potassium dosing guideline, total errors dropped from 165 to 104, and average accuracy improved from 45% to 65% with GPT-5-Chat. GPT-5-Chat conveyed a high level of confidence for 100% of responses, while labeling 80% and 76% of cases as highly complex with and without the criteria, respectively.
Potential harm scores were considerable in both groups; however, severity scores fell notably with the dosing guidance document. Recommendations on concurrent interventions and dosing had the highest rate of errors in both groups. Conclusions: Benchmarks must appropriately reflect clinical complexity to be considered valuable for the deployment of agentic artificial intelligence tools in the healthcare domain. When assessed on a comprehensive medication management task for potassium replacement, GPT-5-Chat improved with dosing guidance, yet its performance on the benchmark remained inadequate.
Kendzerska, T.; Reyes, J.; Poirier, N.; Poirier, A.; Cull, A.; Murkar, A.; Saymeh, M.; Belanger, S.; Williams, M.; Shlik, J.; Jetly, R.; Robillard, R.
Background: Evidence on factors associated with cannabis for medical purposes (CMP) authorizations among Veterans Affairs Canada (VAC) clients remains limited and inconsistent, particularly concerning mental health and posttraumatic stress disorder (PTSD), a leading indication for use. We investigated demographic, clinical and service characteristics associated with VAC authorizations for CMP reimbursement. Method: We linked VAC administrative CMP program data with responses from the 2019 Life After Services Studies cross-sectional survey of Regular Force veterans released between 1998 and 2018. Multivariable logistic regressions examined associations between CMP reimbursement (yes/no) and demographic, clinical and well-being factors, with analyses stratified by PTSD status. Results: Among 1,289 respondents (weighted n=33,131), 18.4% were authorized for CMP reimbursement. Younger age (<40 vs. ≥60 years: OR 4.78, 95% CI: 2.24-10.21), unemployment with inability to work vs. employed (OR 3.10, 95% CI: 1.78-5.40), land service vs. air (OR 2.07, 95% CI: 1.22-3.50), PTSD (OR 2.81, 95% CI: 1.69-4.66), anxiety (OR 2.32, 95% CI: 1.45-3.70), and severe pain vs. no pain (OR 3.61, 95% CI: 1.97-6.60) were independently associated with authorization. Unemployment and severe pain were consistent correlates across PTSD strata. Among those without PTSD, younger age, multiple physical conditions, and frequent mental health visits were significant; among those with PTSD, shorter service, witnessing destruction, and suicidal ideation were additional factors. Conclusions: CMP authorization patterns among Canadian veterans reflect the intersection of mental health, pain, and functional impairment, with variation by PTSD status. These findings underscore the need for longitudinal research on CMP mechanisms, effectiveness and safety.
Urquijo, H.; Goldfine, A. B.; Casas, J. P.; Xu, H.; Timsit, Y. E.; Mendelson, M. M.; Hache, C.; Jones, I.; Arustamian, D.; Magnus, M. C.; Gaunt, T. R.; Lawlor, D. A.; Borges, M. C.
Background: Lipoprotein(a) (Lp[a]) is a genetically determined, causal and independent cardiovascular risk factor, and Lp(a)-targeted therapies are being developed. However, evidence on the safety of substantial Lp(a) lowering during pregnancy is limited. We evaluated the impact of Lp(a) lowering on adverse pregnancy and perinatal outcomes (APPOs) using human genetic evidence. Material and Methods: We applied a drug-target Mendelian randomization (MR) approach using genetic variants associated with Lp(a) in the UK Biobank at the LPA locus to proxy pharmacological Lp(a) lowering. Summary-level APPO data were obtained from the MR-PREG collaboration, comprising up to 714,899 women across multiple studies. Twenty APPOs were included. Sensitivity analyses included adjustment for fetal genotype, alternative Lp(a) datasets, leave-one-study-out analyses, and exploration of Lp(a) genetic scores and individuals homozygous for LPA loss-of-function variants in the UK Biobank. Results: Across 20 APPOs, MR estimates showed no strong evidence of causal effects, with no associations surviving false discovery rate P-value correction. Most estimates were close to null, including gestational hypertension, gestational diabetes, preeclampsia, miscarriage and neonatal intensive care unit admission. Some associations were slightly larger in magnitude but with wide confidence intervals: gestational age (mean difference 0.04 weeks, 95% CI 0.02-0.06 per 210 nmol/L reduction in Lp[a]) and congenital malformation (OR 0.82, 95% CI: 0.72-0.94) in the protective direction of effect, and higher odds of stillbirth (OR 1.09, 95% CI: 1.00-1.19) and low Apgar at 1 minute (OR 1.11, 95% CI: 0.99-1.24). Sensitivity analyses consistently supported the primary findings, with no evidence of increased maternal or offspring risk in analyses adjusting for maternal-fetal genotype, across alternative exposure datasets, or in leave-one-study-out tests.
Individual-level analyses of Lp(a) genetic score and LPA loss-of-function variants showed no associations, although power was limited. Conclusion: These findings suggest that substantial lowering of Lp(a) is unlikely to increase APPO risk, although modest effects, particularly for rare outcomes, cannot be excluded.
Matos Porto, A. P.; Gomes, M. S.; de Oliveira, V. F.; Mwanja, H.; Zhu, N.; Holmes, A.; Levin, A. S.; Costa, S. F.
Background: Digital antimicrobial stewardship (AMS) interventions, such as clinical decision support systems, audit and feedback platforms, and electronic prescribing tools, have been increasingly adopted to improve antibiotic use. However, the effectiveness of these interventions across healthcare settings remains uncertain, and the certainty of the evidence has not been comprehensively evaluated. The objective of this study was to provide a comprehensive understanding of the role of digital interventions in optimizing antimicrobial use and improving clinical outcomes within a broad spectrum of healthcare settings. Methods: We conducted a systematic review and meta-analysis of randomized controlled trials evaluating digital AMS interventions. The review followed PRISMA 2020 guidelines, was registered in PROSPERO (CRD420251178854), and was funded by the Wellcome Trust CAMO Net programme. Searches were performed across major databases. Primary outcomes included the appropriateness of antibiotic prescriptions and the antibiotic prescription rate. Secondary outcomes included 30-day mortality, 30-day hospital readmission, and length of hospital stay (LOS). Random effects models were used to pool effect sizes. Risk of bias was assessed with RoB 2, and certainty of evidence was rated using GRADE. A Summary of Findings table was prepared to present effect estimates, sample sizes, and evidence certainty. Results: Eleven RCTs met the inclusion criteria, and nine were included in the quantitative synthesis. Digital AMS interventions did not show a significant effect on appropriateness of antibiotic prescribing (RR 0.99, 95% CI 0.93 to 1.05; very low certainty). There was no reduction in antibiotic prescription (RR 0.98, 95% CI 0.88 to 1.09), with substantial statistical heterogeneity and very low certainty.
Across clinical outcomes, digital AMS showed no effect on 30-day mortality (RR 0.91, 95% CI 0.77 to 1.09; very low certainty) or 30-day readmission (RR 0.95, 95% CI 0.79 to 1.14; very low certainty). For LOS, results were inconsistent across studies, and the pooled effect showed no clinically meaningful change (MD 0.17 days, 95% CI 0.01 to 0.35; very low certainty). Most trials had some concerns about bias due to deviations from intended interventions. Conclusion: Meta-analyses of digital AMS RCTs did not yield high-certainty evidence of an effect on antibiotic prescribing or clinical outcomes, owing to high heterogeneity in interventions and study designs and to limitations of the RCTs (no adoption/fidelity metrics).
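The abstract above states that random effects models were used to pool effect sizes but does not name the estimator. As a rough illustration only, the sketch below assumes the DerSimonian-Laird method applied to log risk ratios; the function name and the three trial inputs are hypothetical and are not taken from the review.

```python
import math

def dersimonian_laird(log_effects, std_errors):
    """Pool log effect sizes with DerSimonian-Laird random-effects weights.

    Returns (pooled log effect, its standard error, between-study variance tau^2).
    """
    k = len(log_effects)
    w = [1.0 / se ** 2 for se in std_errors]            # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, se_pooled, tau2

# Hypothetical inputs: risk ratios and standard errors of log(RR) for three trials.
log_rr = [math.log(0.90), math.log(1.05), math.log(0.98)]
se_log_rr = [0.05, 0.08, 0.06]

pooled, se_pooled, tau2 = dersimonian_laird(log_rr, se_log_rr)
lower = math.exp(pooled - 1.96 * se_pooled)
upper = math.exp(pooled + 1.96 * se_pooled)
print(f"pooled RR {math.exp(pooled):.2f} (95% CI {lower:.2f} to {upper:.2f}), tau^2 {tau2:.4f}")
```

When the estimated between-study variance is zero, the weights reduce to the fixed-effect inverse-variance weights; larger values widen the pooled confidence interval.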
Bonilla, K.; Sherman, V. M.; Arbaiza, A. S.; Dougherty, M.; Olson, L. E.
In some countries, melatonin is sold without a physician prescription and dosage is unregulated. Transdermal products have become popular, including those marketed for children. We measured consumer assumptions about these products among adult residents of the United States, analyzed lot-to-lot variability, and compared the pharmacokinetics of melatonin administered in oral, lotion, and bath product forms. Survey respondents (n=199) believed oral melatonin was more effective than transdermal products and that all melatonin products were relatively safe. Melatonin lotion products analyzed by HPLC displayed lot-to-lot variability as well as changes in formulation and product claims. To determine pharmacokinetics, three different treatments (oral tablets, lotion, and bath immersion) were administered to twelve undergraduate participants in a randomized, crossover design. Five additional participants completed bath product treatment only. Participants collected saliva samples up to 48 hours after administration, which were analyzed for melatonin by enzyme-linked immunosorbent assay. Oral (n=11) and lotion formulations (n=12) produced maximum salivary melatonin levels within 30 minutes of administration, but bath immersion did not increase salivary melatonin (n=17). The half-life of oral melatonin was 1.17 [0.69–1.65] hours versus 5.72 [3.75–7.68] hours for lotion treatment (p = 0.011, effect size r = 0.770). Melatonin lotion may pose a risk to consumers who assume it is safe and less effective than oral tablets, when in fact it may be very potent and remain at high physiological levels into the following day. This study is registered on clinicaltrials.gov (NCT06382610) and was funded by the Sleep Research Society.
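The abstract above reports half-lives without describing how they were derived. One common approach, assumed here purely for illustration, treats the post-peak decline as first-order elimination and takes t1/2 = ln(2)/k, where k is the negative slope of a straight line fitted to log concentration against time. The function name and the synthetic, noise-free samples below are hypothetical; the curve is generated from the reported 1.17-hour oral half-life only so the fit has a known answer.

```python
import math

def half_life_from_samples(times_h, concentrations):
    """Estimate t1/2 = ln(2)/k from a least-squares line through (t, ln C)."""
    logs = [math.log(c) for c in concentrations]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    slope = sxy / sxx              # equals -k under first-order elimination
    return math.log(2) / -slope

# Hypothetical post-peak samples generated from a 1.17 h half-life (no noise).
k_true = math.log(2) / 1.17
times = [1.0, 2.0, 3.0, 4.0]
conc = [100.0 * math.exp(-k_true * t) for t in times]

estimate = half_life_from_samples(times, conc)
print(f"estimated half-life: {estimate:.2f} h")
```

With real salivary data the samples would be noisy and the choice of which post-peak points to include would affect the estimate.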